research-article

WebRobot: web robotic process automation using interactive programming-by-demonstration

Authors:
Rui Dong, University of Michigan, USA
Zhicheng Huang, University of Michigan, USA
Ian Iong Lam, University of Michigan, USA
Yan Chen, University of Toronto, Canada
Xinyu Wang, University of Michigan, USA

PLDI 2022: Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, June 2022, Pages 152–167. https://doi.org/10.1145/3519939.3523711

Published: 09 June 2022

ABSTRACT

It is imperative to democratize robotic process automation (RPA): RPA has become a main driver of digital transformation, yet RPA programs remain technically demanding to construct, especially for non-experts. In this paper, we study how to automate an important class of RPA tasks, dubbed web RPA, which are concerned with constructing software bots that automate interactions across data and a web browser. Our main contributions are twofold. First, we develop a formal foundation that allows semantic reasoning about web RPA programs, and we formulate the synthesis problem in a principled manner. Second, we propose a web RPA program synthesis algorithm based on a new idea called speculative rewriting. This leads to a novel speculate-and-validate methodology in the context of rewrite-based program synthesis, which has also been shown to be both theoretically simple and practically efficient for synthesizing programs from demonstrations. We have built these ideas into a new interactive synthesizer called WebRobot and evaluated it on 76 web RPA benchmarks. Our results show that WebRobot automated a majority of them effectively. Furthermore, we show that WebRobot compares favorably with a conventional rewrite-based synthesis baseline implemented using egg. Finally, we conduct a small user study demonstrating that WebRobot is also usable.
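
As a rough illustration of the speculate-and-validate idea named in the abstract, the toy sketch below takes a demonstrated trace of browser actions, speculates a loop from two consecutive actions whose selectors differ only in one index, and then validates that guess against the rest of the trace. It is not WebRobot's algorithm. The action representation, the function names `generalize`, `validate`, and `speculate_and_validate`, the `min_iters` threshold, and the example selectors are all assumptions made for this example, and validation here is a plain comparison against the recorded trace rather than execution on a web page.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical action format: (kind, selector), e.g. ("scrape", "//li[1]/a").
Action = Tuple[str, str]


def generalize(a: Action, b: Action) -> Optional[Tuple[str, str, int]]:
    """Speculate a loop body from two consecutive actions.

    Returns (kind, template, first_index) when both actions have the same
    kind and their selectors differ in exactly one integer that grows by 1.
    The template marks the varying index with the placeholder "{i}".
    """
    if a[0] != b[0]:
        return None
    parts_a = re.split(r"(\d+)", a[1])  # digits land at odd positions
    parts_b = re.split(r"(\d+)", b[1])
    if len(parts_a) != len(parts_b):
        return None
    diff = [i for i, (x, y) in enumerate(zip(parts_a, parts_b)) if x != y]
    if len(diff) != 1:
        return None
    i = diff[0]
    if not (parts_a[i].isdigit() and parts_b[i].isdigit()):
        return None
    if int(parts_b[i]) != int(parts_a[i]) + 1:
        return None
    template = "".join(parts_a[:i] + ["{i}"] + parts_a[i + 1:])
    return a[0], template, int(parts_a[i])


def validate(trace: List[Action], start: int, kind: str,
             template: str, first: int) -> int:
    """Count how many actions from `start` the speculated loop reproduces."""
    n = 0
    while start + n < len(trace):
        expected = (kind, template.replace("{i}", str(first + n)))
        if trace[start + n] != expected:
            break
        n += 1
    return n


def speculate_and_validate(trace: List[Action], min_iters: int = 3):
    """Return the first speculated loop that the trace confirms, else None."""
    for s in range(len(trace) - 1):
        candidate = generalize(trace[s], trace[s + 1])
        if candidate is None:
            continue
        kind, template, first = candidate
        n = validate(trace, s, kind, template, first)
        if n >= min_iters:  # keep only guesses backed by enough iterations
            return {
                "prefix": trace[:s],
                "loop": (kind, template, first),
                "iterations": n,
                "suffix": trace[s + n:],
            }
    return None


demo = [
    ("goto", "https://example.com"),
    ("scrape", "//li[1]/a"),
    ("scrape", "//li[2]/a"),
    ("scrape", "//li[3]/a"),
]
result = speculate_and_validate(demo)
```

On the `demo` trace, the guess made from the first two `scrape` actions is accepted because the third `scrape` action matches what the speculated loop predicts. A guess that the trace does not confirm for at least `min_iters` iterations is discarded.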


Index Terms

1. Software and its engineering
  1. Software creation and management
    1. Software development techniques
      1. Automatic programming


Published in
PLDI 2022: Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation
June 2022
1038 pages
ISBN: 9781450392655
DOI: 10.1145/3519939
General Chair: Ranjit Jhala, University of California at San Diego, USA
Program Chair: Işil Dillig, University of Texas at Austin, USA
Copyright © 2022 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Human-in-the-loop
Program Synthesis
Programming by Demonstration
Rewrite-based Synthesis
Robotic Process Automation
Web Automation
Acceptance Rates
Overall Acceptance Rate: 406 of 2,067 submissions, 20%